Neural Motifs: Scene Graph Parsing with Global Context
We investigate the problem of producing structured graph representations of
visual scenes. Our work analyzes the role of motifs: regularly appearing
substructures in scene graphs. We present new quantitative insights on such
repeated structures in the Visual Genome dataset. Our analysis shows that
object labels are highly predictive of relation labels but not vice-versa. We
also find that there are recurring patterns even in larger subgraphs: more than
50% of graphs contain motifs involving at least two relations. Our analysis
motivates a new baseline: given object detections, predict the most frequent
relation between object pairs with the given labels, as seen in the training
set. This baseline improves on the previous state of the art by an average of
3.6% (relative) across evaluation settings. We then introduce Stacked
Motif Networks, a new architecture designed to capture higher order motifs in
scene graphs that further improves over our strong baseline by an average
relative gain of 7.1%. Our code is available at github.com/rowanz/neural-motifs.
Comment: CVPR 2018 camera read
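The frequency baseline described in the abstract above (predict the most frequent training-set relation for a given pair of object labels) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `default` fallback, and the toy triples are all hypothetical, and object detection and candidate ranking are left out entirely.

```python
from collections import Counter, defaultdict


def fit_frequency_baseline(triples):
    """Count how often each relation occurs for every (subject, object) label pair."""
    counts = defaultdict(Counter)
    for subj, rel, obj in triples:
        counts[(subj, obj)][rel] += 1
    return counts


def predict_relation(counts, subj, obj, default=None):
    """Return the most frequent training relation for the pair, or `default` if unseen."""
    pair_counts = counts.get((subj, obj))
    if not pair_counts:
        return default
    return pair_counts.most_common(1)[0][0]


# Hypothetical toy training triples (subject label, relation, object label).
train_triples = [
    ("man", "wearing", "shirt"),
    ("man", "wearing", "shirt"),
    ("man", "holding", "shirt"),
    ("dog", "on", "grass"),
]

counts = fit_frequency_baseline(train_triples)
print(predict_relation(counts, "man", "shirt"))
print(predict_relation(counts, "cat", "sofa", default="no-relation"))
```

The sketch uses no visual features at all: the prediction depends only on the two object labels, which is the property the abstract highlights when it says object labels are highly predictive of relation labels.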
PIQA: Reasoning about Physical Commonsense in Natural Language
To apply eyeshadow without a brush, should I use a cotton swab or a
toothpick? Questions requiring this kind of physical commonsense pose a
challenge to today's natural language understanding systems. While recent
pretrained models (such as BERT) have made progress on question answering over
more abstract domains - such as news articles and encyclopedia entries, where
text is plentiful - in more physical domains, text is inherently limited due to
reporting bias. Can AI systems learn to reliably answer physical commonsense
questions without experiencing the physical world? In this paper, we introduce
the task of physical commonsense reasoning and a corresponding benchmark
dataset, Physical Interaction: Question Answering, or PIQA. Though humans find
the dataset easy (95% accuracy), large pretrained models struggle (77%). We
provide analysis about the dimensions of knowledge that existing models lack,
which offers significant opportunities for future research.
Comment: AAAI 202